Statistics Research Letters (SRL) Volume 3, 2014 


www.srl-journal.org 


Incomplete Polynomial Diagonals-Parameter 
Symmetry Model and Decomposition of 
Incomplete Symmetry Model for Square 
Contingency Tables 

Hiroyuki Kurakami 1 , Akihiro Fujimura 2 , Sadao Tomizawa 3 

1,2 Graduate School of Science and Technology, Tokyo University of Science, 3 Faculty of Science and Technology,
Tokyo University of Science
*1,2,3 Noda City, Chiba, 278-8510, Japan

1 h-kurakami@mti.biglobe.ne.jp; 2 ak74i. hlww.ocjonrail@gmail.com; 3 tomizawa@is.noda.tus.ac.jp

Received 13 December, 2013; Revised 10 March, 2014; Accepted 20 March, 2014; Published 18 May, 2014 
© 2014 Science and Engineering Publishing Company 

Abstract 

For square contingency tables with ordered categories, Tomizawa (1990) considered the polynomial diagonals-parameter symmetry (PDPS) model. The present paper proposes the incomplete PDPS model, which has the structure of PDPS for the off-diagonal cells except a specified pair of cells $(u,v)$ and $(v,u)$, $u < v$, in the table. It also gives the decomposition of the incomplete symmetry model into the incomplete simple PDPS model and the incomplete mean equality model. An example is given.
Introduction

For an $r \times r$ square contingency table with the same ordinal row and column classifications, let $p_{ij}$ denote the probability that an observation will fall in the $i$th row and $j$th column of the table ($i = 1, \ldots, r$; $j = 1, \ldots, r$).

Goodman (1979) considered the diagonals-parameter symmetry (DPS) model defined by

\[
p_{ij} =
\begin{cases}
\theta_{j-i}\,\phi_{ij} & (i < j), \\
\phi_{ij} & (i \ge j),
\end{cases}
\]

where $\phi_{ij} = \phi_{ji}$. This model states that the probability that an observation will fall in a cell $(i,j)$ for $i < j$ is $\theta_{j-i}$ times the probability that the observation falls in the cell $(j,i)$. Special cases of this model obtained by putting $\theta_1 = \cdots = \theta_{r-1} = 1$ and $\theta_1 = \cdots = \theta_{r-1}$ $(= \theta)$ are the symmetry (S) model (Bowker, 1948) and the conditional symmetry (CS) model (McCullagh, 1978), respectively.

Agresti (1983) considered the linear diagonals-parameter symmetry (LDPS) model defined by

\[
p_{ij} =
\begin{cases}
\theta^{\,j-i}\,\phi_{ij} & (i < j), \\
\phi_{ij} & (i \ge j),
\end{cases}
\]

where $\phi_{ij} = \phi_{ji}$. This model is a special case of the DPS model.

Tomizawa (1987) considered the 2-ratios-parameter symmetry (2RPS) model defined by

\[
p_{ij} =
\begin{cases}
\varphi\,\theta^{\,j-i}\,\phi_{ij} & (i < j), \\
\phi_{ij} & (i \ge j),
\end{cases}
\]

where $\phi_{ij} = \phi_{ji}$. Special cases of this model obtained by putting $\theta = 1$ and $\varphi = 1$ are the CS and LDPS models, respectively.

Tomizawa (1990) considered the polynomial diagonals-parameter symmetry (PDPS) model defined by

\[
p_{ij} =
\begin{cases}
\left( \prod_{k=0}^{r-2} \theta_k^{(j-i)^k} \right) \phi_{ij} & (i < j), \\
\phi_{ij} & (i \ge j),
\end{cases}
\]

where $\phi_{ij} = \phi_{ji}$. This model is a generalization of the S, CS, LDPS and 2RPS models and another expression of the DPS model; see Tomizawa (1990). Special cases of this model obtained by putting $\theta_0 = \theta_1 = \cdots = \theta_{r-2} = 1$, $\theta_1 = \theta_2 = \cdots = \theta_{r-2} = 1$, $\theta_0 = \theta_2 = \cdots = \theta_{r-2} = 1$ and $\theta_2 = \theta_3 = \cdots = \theta_{r-2} = 1$ are the S, CS, LDPS and 2RPS models, respectively.
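The way the special cases fall out of the PDPS ratio $p_{ij}/p_{ji} = \prod_{k} \theta_k^{(j-i)^k}$ can be checked with a minimal sketch. The function name `pdps_ratio` and all parameter values below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from math import prod

def pdps_ratio(theta, d):
    """Return p_ij / p_ji for cells with j - i = d under the PDPS model.

    theta is the list [theta_0, ..., theta_{r-2}]; the ratio is the
    product over k of theta_k ** (d ** k).
    """
    return prod(t ** (d ** k) for k, t in enumerate(theta))

r = 4
distances = range(1, r)  # j - i = 1, ..., r - 1

# Hypothetical parameter values, chosen only to illustrate the special cases.
s_ratios = [pdps_ratio([1.0, 1.0, 1.0], d) for d in distances]     # S: all theta_k = 1
cs_ratios = [pdps_ratio([0.5, 1.0, 1.0], d) for d in distances]    # CS: only theta_0 free
ldps_ratios = [pdps_ratio([1.0, 0.5, 1.0], d) for d in distances]  # LDPS: only theta_1 free
rps2_ratios = [pdps_ratio([2.0, 0.5, 1.0], d) for d in distances]  # 2RPS: theta_0, theta_1 free
```

With these values the ratio is constant in $j-i$ for CS, geometric in $j-i$ for LDPS, and a constant times a geometric term for 2RPS, which matches the definitions given above.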

For analyzing the data in square tables, when a certain model does not hold, we are interested in finding which cells influence the lack of the structure of the model. The incomplete S and incomplete CS models were considered by Tomizawa and Tokunaga (2006), and the incomplete DPS model was considered by Kurakami, Fujimura and Tomizawa (2013). These models have the structure of the corresponding S, CS and DPS models, respectively, except for a specified pair of cells $(u,v)$ and $(v,u)$, where $1 \le u < v \le r$.

The present paper proposes the incomplete PDPS model (and the incomplete LDPS and 2RPS models), which has the structure of PDPS for the off-diagonal cells in the table except a specified pair of cells. It also gives the decomposition of the incomplete S model into the incomplete LDPS model and the incomplete mean equality model.


Incomplete Polynomial Diagonals-Parameter
Symmetry Model

Consider the incomplete PDPS model as follows: for a fixed $(u,v)$, where $1 \le u < v \le r$,

\[
p_{ij} =
\begin{cases}
\left( \prod_{k=0}^{r-2} \theta_k^{(j-i)^k} \right) \phi_{ij} & (i < j,\ (i,j) \ne (u,v)), \\
\phi_{ij} & (i \ge j,\ (i,j) \ne (v,u)),
\end{cases}
\]

where $\phi_{ij} = \phi_{ji}$. This model has the structure of PDPS for the off-diagonal cells except a specified pair of cells $(u,v)$ and $(v,u)$ in the table. We denote this model by PDPS$(u,v)$. This model states that the probability that an observation will fall in a cell $(i,j)$ for $i < j$, except the pair of cells $(u,v)$ and $(v,u)$, is $\prod_{k=0}^{r-2} \theta_k^{(j-i)^k}$ times the probability that it falls in the cell $(j,i)$. Note that the PDPS model implies the PDPS$(u,v)$ model. For analyzing the data, when the PDPS model does not hold, it may be possible to find which cells influence the lack of the structure of PDPS by applying various PDPS$(u,v)$ models.

Special cases of the PDPS$(u,v)$ model obtained by putting $\theta_0 = \theta_1 = \cdots = \theta_{r-2} = 1$, $\theta_1 = \theta_2 = \cdots = \theta_{r-2} = 1$, $\theta_0 = \theta_2 = \cdots = \theta_{r-2} = 1$ and $\theta_2 = \theta_3 = \cdots = \theta_{r-2} = 1$ are the S$(u,v)$, CS$(u,v)$, LDPS$(u,v)$ and 2RPS$(u,v)$ models, respectively. The LDPS$(u,v)$ and 2RPS$(u,v)$ models indicate the structure of the incomplete LDPS and incomplete 2RPS, respectively. Note that the LDPS$(u,v)$ and 2RPS$(u,v)$ models are new models. The LDPS$(u,v)$ model is defined by

\[
p_{ij} =
\begin{cases}
\theta^{\,j-i}\,\phi_{ij} & (i < j,\ (i,j) \ne (u,v)), \\
\phi_{ij} & (i \ge j,\ (i,j) \ne (v,u)),
\end{cases}
\]

where $\phi_{ij} = \phi_{ji}$. A special case of this model obtained by putting $\theta = 1$ is the S$(u,v)$ model. The 2RPS$(u,v)$ model is defined by

\[
p_{ij} =
\begin{cases}
\varphi\,\theta^{\,j-i}\,\phi_{ij} & (i < j,\ (i,j) \ne (u,v)), \\
\phi_{ij} & (i \ge j,\ (i,j) \ne (v,u)),
\end{cases}
\]

where $\phi_{ij} = \phi_{ji}$. Special cases of this model obtained by putting $\theta = 1$ and $\varphi = 1$ are the CS$(u,v)$ and LDPS$(u,v)$ models, respectively. We point out that the PDPS$(u,v)$ model with no restriction between the parameters $\{\theta_k\}$ is equivalent to the DPS$(u,v)$ model because the PDPS model with no restriction is equivalent to the DPS model.
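As a rough illustration of what the PDPS$(u,v)$ structure leaves restricted and what it leaves free, the following sketch builds a table of cell probabilities from symmetric weights, a parameter vector, and two unrestricted values for the excluded pair. The function `pdps_uv_table` and every numerical input are hypothetical, introduced here only for illustration.

```python
from math import prod

def pdps_uv_table(phi, theta, u, v, p_uv, p_vu):
    """Build cell probabilities under PDPS(u, v), with 1-based u < v.

    phi        : symmetric r x r list of positive weights (phi[i][j] == phi[j][i])
    theta      : [theta_0, ..., theta_{r-2}]
    p_uv, p_vu : unrestricted weights for the excluded cells (u, v) and (v, u)
    The weights are rescaled at the end so that the table sums to one.
    """
    r = len(phi)
    w = [[0.0] * r for _ in range(r)]
    for i in range(r):
        for j in range(r):
            if (i + 1, j + 1) == (u, v):
                w[i][j] = p_uv                      # excluded cell: no structure imposed
            elif (i + 1, j + 1) == (v, u):
                w[i][j] = p_vu                      # excluded cell: no structure imposed
            elif i < j:
                d = j - i
                ratio = prod(t ** (d ** k) for k, t in enumerate(theta))
                w[i][j] = ratio * phi[i][j]         # upper cell = ratio * symmetric weight
            else:
                w[i][j] = phi[i][j]                 # diagonal and lower cells
    total = sum(map(sum, w))
    return [[x / total for x in row] for row in w]

# Hypothetical inputs for a 3 x 3 table (r = 3, so theta has r - 1 = 2 entries).
phi = [[4.0, 2.0, 1.0],
       [2.0, 4.0, 2.0],
       [1.0, 2.0, 4.0]]
p = pdps_uv_table(phi, theta=[3.0, 0.5], u=1, v=3, p_uv=3.0, p_vu=0.5)

ratio_12 = p[0][1] / p[1][0]  # governed by the model: 3.0 * 0.5 ** 1
ratio_23 = p[1][2] / p[2][1]  # governed by the model: 3.0 * 0.5 ** 1
ratio_13 = p[0][2] / p[2][0]  # excluded pair: equals p_uv / p_vu, not the model ratio
```

Here the pairs at distance one share the model ratio, while the excluded pair $(1,3)$, $(3,1)$ keeps whatever ratio its two free values imply.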


Decomposition of Incomplete Symmetry 
Model 

Let $X$ and $Y$ denote the row and column variables, respectively. Tahata, Yamamoto and Tomizawa (2008), and Tahata, Yamamoto and Tomizawa (2013) gave the decomposition of the S model into the LDPS model and the mean equality (ME) model. The ME model is defined by $E(X) = E(Y)$. For a fixed $(u,v)$, where $1 \le u < v \le r$, we define the incomplete ME (ME$(u,v)$) model by

\[
E\bigl(X \mid (X,Y) \ne (u,v), (v,u)\bigr) = E\bigl(Y \mid (X,Y) \ne (u,v), (v,u)\bigr).
\]

We obtain the following theorem.

Theorem 1. For a fixed $(u,v)$, where $1 \le u < v \le r$, the S$(u,v)$ model holds if and only if both the LDPS$(u,v)$ and ME$(u,v)$ models hold.

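Proof. Let

\[
q_{ij} = \frac{p_{ij}}{c} \quad ((i,j) \ne (u,v), (v,u)), \qquad
c = \mathop{\sum\sum}_{(s,t) \ne (u,v),(v,u)} p_{st}.
\]

The S$(u,v)$ model is expressed as

\[
q_{ij} = q_{ji} \quad ((i,j) \ne (u,v), (v,u)).
\]

The ME$(u,v)$ model is expressed as

\[
\mathop{\sum\sum}_{(i,j) \ne (u,v),(v,u)} (i\,q_{ij} - j\,q_{ij}) = 0.
\]

This may be expressed as

\[
\mathop{\sum\sum}_{i<j,\ (i,j) \ne (u,v)} (j-i)(q_{ij} - q_{ji}) = 0. \qquad (1)
\]

If the S$(u,v)$ model holds, then the LDPS$(u,v)$ and the ME$(u,v)$ models hold. Conversely, if the LDPS$(u,v)$ and the ME$(u,v)$ models hold, then $q_{ij} - q_{ji} = (\theta^{\,j-i} - 1)\phi_{ij}/c$ for $i < j$, $(i,j) \ne (u,v)$, and the equation (1) is expressed as

\[
\mathop{\sum\sum}_{i<j,\ (i,j) \ne (u,v)} (j-i)(\theta^{\,j-i} - 1)\,\phi_{ij} = 0,
\]

where $\phi_{ij}$ satisfies $\phi_{ij} = \phi_{ji}$. Since $j - i > 0$ and $\phi_{ij} \ge 0$, every term of this sum has the same sign as $\theta - 1$, so the sum can vanish (with some $\phi_{ij} > 0$) only if $\theta = 1$. Thus we obtain $\theta = 1$. Namely the S$(u,v)$ model holds. The proof is completed.

TABLE 1 NUMBERS OF DEGREES OF FREEDOM FOR MODELS.

| Models | Degrees of freedom |
|---|---|
| S | $r(r-1)/2$ |
| CS | $(r+1)(r-2)/2$ |
| PDPS | $(r-1)(r-2)/2$ |
| LDPS | $(r+1)(r-2)/2$ |
| 2RPS | $(r^2-r-4)/2$ |
| S$(u,v)$ | $(r+1)(r-2)/2$ |
| CS$(u,v)$ | $(r^2-r-4)/2$ |
| PDPS$(u,v)$ | $r(r-3)/2$ |
| LDPS$(u,v)$ | $(r^2-r-4)/2$ |
| 2RPS$(u,v)$ | $(r+2)(r-3)/2$ |
| ME$(u,v)$ | 1 |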


Goodness-of-fit Test 


Assume that a multinomial distribution is applied to the $r \times r$ table. The maximum likelihood estimates (MLEs) of expected frequencies under the incomplete PDPS model could be obtained using an iterative procedure, for example, the Newton-Raphson method applied to the log-likelihood equations. Let $n_{ij}$ denote the observed frequency in the $i$th row and $j$th column of the $r \times r$ table. Let $m_{ij}$ denote the corresponding expected frequency ($i = 1, \ldots, r$; $j = 1, \ldots, r$) and let $\hat{m}_{ij}$ denote the MLE of $m_{ij}$ under the model. The likelihood ratio chi-squared statistic for testing the goodness-of-fit of the model is

\[
G^2 = 2 \sum_{i=1}^{r} \sum_{j=1}^{r} n_{ij} \log\left( \frac{n_{ij}}{\hat{m}_{ij}} \right).
\]

TABLE 2 MOTHER'S EDUCATION BY FATHER'S EDUCATION; FROM MULLINS AND SITES (1984) FOR A SAMPLE OF EMINENT BLACK AMERICANS. THE PARENTHESIZED VALUES ARE THE MLES OF THE CS(1, 3) MODEL.

| Mother's Education | Father's Education (1) | (2) | (3) | (4) | Total |
|---|---|---|---|---|---|
| (1) | 81 (81.00) | 3 (6.71) | 9 (9.00) | 11 (12.64) | 104 |
| (2) | 14 (10.29) | 8 (8.00) | 9 (6.32) | 6 (4.74) | 37 |
| (3) | 43 (43.00) | 7 (9.68) | 43 (43.00) | 18 (16.59) | 111 |
| (4) | 21 (19.36) | 6 (7.26) | 24 (25.41) | 87 (87.00) | 138 |
| Total | 159 | 24 | 85 | 122 | 390 |

NOTE: (1) 8th GRADE OR LESS, (2) PART HIGH SCHOOL, (3) HIGH SCHOOL, (4) COLLEGE


The numbers of degrees of freedom for various PDPS 
and PDPS (u,v) models are described in Table 1. 
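The formulas of Table 1 can be evaluated for a given $r$ with a short sketch. The function name `degrees_of_freedom` is hypothetical; the formulas themselves are copied from Table 1.

```python
def degrees_of_freedom(r):
    """Degrees of freedom from Table 1 for an r x r table."""
    return {
        "S": r * (r - 1) // 2,
        "CS": (r + 1) * (r - 2) // 2,
        "PDPS": (r - 1) * (r - 2) // 2,
        "LDPS": (r + 1) * (r - 2) // 2,
        "2RPS": (r * r - r - 4) // 2,
        "S(u,v)": (r + 1) * (r - 2) // 2,
        "CS(u,v)": (r * r - r - 4) // 2,
        "PDPS(u,v)": r * (r - 3) // 2,
        "LDPS(u,v)": (r * r - r - 4) // 2,
        "2RPS(u,v)": (r + 2) * (r - 3) // 2,
        "ME(u,v)": 1,
    }

df4 = degrees_of_freedom(4)  # the 4 x 4 case used in the Example
```

For $r = 4$ this gives 6, 5, 3, 5 and 4 for the S, CS, PDPS, LDPS and 2RPS models, and each incomplete model has one degree of freedom fewer than its complete counterpart, which reflects the one pair of cells that is left unrestricted.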

Example 

The data in Table 2, taken directly from Mullins and Sites (1984), are for a sample of eminent black Americans, defined as persons having a biographical sketch in the publication Who's Who Among Black Americans. The table cross-classifies the educational attainment of the mother with that of the father.

We see from Table 3 that the S, CS, PDPS, LDPS and 2RPS models fit these data poorly, so we apply various incomplete models. Table 3 shows that the CS(1, 3), PDPS(1, 3), LDPS(1, 3), 2RPS(1, 3), PDPS(1, 2) and PDPS(2, 4) models fit these data well, whereas the other models fit these data poorly. Therefore, for example, we can see that the poor fit of the CS model (i.e., the PDPS model with $\theta_1 = \theta_2 = 1$) is caused by the lack of the structure of CS for the pair of cells (1, 3) and (3, 1). Under the CS(1, 3) model, the MLE of $\theta_0$ is 0.65 and the MLE of $p_{13}/p_{31}$ is 0.21. Note that the approximate 95% confidence interval of $\theta_0$ is [0.41, 0.89], with the standard error 0.12. Therefore, under this model, the probability that an individual's mother's educational attainment is $i$ and his/her father's attainment is $j$ $(> i)$, where $(i,j) \ne (1,3)$, is estimated to be 0.65 times the probability that the mother's attainment is $j$ and the father's attainment is $i$. The probability that the mother's attainment is (1) and the father's attainment is (3) is estimated to be 0.21 times the probability that the mother's attainment is (3) and the father's attainment is (1).
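The CS(1, 3) figures quoted above can be retraced with a minimal sketch. It assumes the usual closed-form fit of conditional symmetry applied to all off-diagonal pairs except the excluded one, namely $\hat{\theta}_0$ equal to the ratio of the upper to the lower off-diagonal totals. The standard error uses a delta-method formula for a ratio of two disjoint multinomial counts; that formula is an assumption of this sketch, since the paper does not state how its standard error was obtained. The name `fit_cs_uv` is hypothetical.

```python
from math import log, sqrt

# Observed counts from Table 2 (rows: mother's education, columns: father's).
table2 = [[81, 3, 9, 11],
          [14, 8, 9, 6],
          [43, 7, 43, 18],
          [21, 6, 24, 87]]

def fit_cs_uv(n, u, v):
    """Closed-form fit of CS(u, v), 1-based u < v.

    Returns (theta_hat, fitted counts, upper total, lower total).
    """
    r = len(n)
    pairs = [(i, j) for i in range(r) for j in range(i + 1, r)
             if (i + 1, j + 1) != (u, v)]
    upper = sum(n[i][j] for i, j in pairs)  # counts above the diagonal
    lower = sum(n[j][i] for i, j in pairs)  # mirrored counts below it
    theta_hat = upper / lower
    fitted = [[float(x) for x in row] for row in n]  # excluded pair stays as observed
    for i, j in pairs:
        total = n[i][j] + n[j][i]
        fitted[i][j] = total * theta_hat / (1.0 + theta_hat)
        fitted[j][i] = total / (1.0 + theta_hat)
    return theta_hat, fitted, upper, lower

theta_hat, fitted, upper, lower = fit_cs_uv(table2, 1, 3)
ratio_13 = table2[0][2] / table2[2][0]  # the excluded pair is fitted exactly

g2_cs13 = 2.0 * sum(
    n * log(n / m)
    for n_row, m_row in zip(table2, fitted)
    for n, m in zip(n_row, m_row)
    if n > 0
)

# Assumed delta-method standard error of a ratio of two disjoint counts.
se = theta_hat * sqrt(1.0 / upper + 1.0 / lower)
ci = (theta_hat - 1.96 * se, theta_hat + 1.96 * se)
```

Under these assumptions the sketch gives $\hat{\theta}_0 = 47/72 \approx 0.65$, $\hat{p}_{13}/\hat{p}_{31} = 9/43 \approx 0.21$, fitted values such as 6.71 and 10.29 for cells (1, 2) and (2, 1), and $G^2 \approx 6.72$, in line with Tables 2 and 3. The assumed standard error rounds to 0.12 and the interval to [0.41, 0.89].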
TABLE 3 VALUES OF $G^2$ FOR THE MODELS APPLIED TO THE DATA IN TABLE 2.

| Models | Degrees of freedom | $G^2$ |
|---|---|---|
| S | 6 | 36.18* |
| CS | 5 | 15.40* |
| PDPS | 3 | 10.96* |
| LDPS | 5 | 14.82* |
| 2RPS | 4 | 14.25* |
| S(1, 2) | 5 | 28.46* |
| S(1, 3) | 5 | 12.01* |
| S(1, 4) | 5 | 33.00* |
| S(2, 3) | 5 | 35.93* |
| S(2, 4) | 5 | 36.18* |
| S(3, 4) | 5 | 35.32* |
| CS(1, 2) | 4 | 13.25* |
| CS(1, 3) | 4 | 6.72 |
| CS(1, 4) | 4 | 15.35* |
| CS(2, 3) | 4 | 11.26* |
| CS(2, 4) | 4 | 13.76* |
| CS(3, 4) | 4 | 12.89* |
| PDPS(1, 2) | 2 | 5.98 |
| PDPS(1, 3) | 2 | 5.81 |
| PDPS(2, 3) | 2 | 8.76* |
| PDPS(2, 4) | 2 | 5.81 |
| PDPS(3, 4) | 2 | 10.63* |
| LDPS(1, 2) | 4 | 10.64* |
| LDPS(1, 3) | 4 | 7.01 |
| LDPS(1, 4) | 4 | 11.41* |
| LDPS(2, 3) | 4 | 13.12* |
| LDPS(2, 4) | 4 | 12.85* |
| LDPS(3, 4) | 4 | 14.70* |
| 2RPS(1, 2) | 3 | 10.64* |
| 2RPS(1, 3) | 3 | 6.53 |
| 2RPS(1, 4) | 3 | 10.96* |
| 2RPS(2, 3) | 3 | 11.08* |
| 2RPS(2, 4) | 3 | 12.27* |
| 2RPS(3, 4) | 3 | 12.85* |
| ME(1, 2) | 1 | 17.52* |
| ME(1, 3) | 1 | 4.97* |
| ME(1, 4) | 1 | 21.77* |
| ME(2, 3) | 1 | 22.04* |
| ME(2, 4) | 1 | 22.49* |
| ME(3, 4) | 1 | 19.99* |

* means significant at the 5% level.
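The asterisks in Table 3 can be retraced by comparing each $G^2$ with the upper 5% point of the chi-squared distribution for the stated degrees of freedom. The sketch below uses standard tabulated critical values and a handful of rows copied from Table 3; the helper name `is_significant` is hypothetical.

```python
# Upper 5% points of the chi-squared distribution for 1 to 6 degrees of
# freedom (standard table values).
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070, 6: 12.592}

# A few (model, degrees of freedom, G^2) rows copied from Table 3.
rows = [
    ("CS", 5, 15.40),
    ("CS(1, 3)", 4, 6.72),
    ("PDPS(1, 2)", 2, 5.98),
    ("PDPS(2, 3)", 2, 8.76),
    ("LDPS(1, 3)", 4, 7.01),
    ("ME(1, 3)", 1, 4.97),
]

def is_significant(df, g2):
    """True when G^2 exceeds the 5% critical value, i.e. the model fits poorly."""
    return g2 > CHI2_CRIT_05[df]

flags = {model: is_significant(df, g2) for model, df, g2 in rows}
```

Note that PDPS(1, 2), with $G^2 = 5.98$ on 2 degrees of freedom, lies only just below the critical value 5.991, so its classification as a well-fitting model is a borderline one.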


Therefore, we can see that the pair of cells (1, 3) and (3, 1) influences the lack of the structure of CS. We point out that the CS model fits the data poorly but the CS(1, 3) model fits the data well, and so the CS(1, 3) model is useful for seeing the reason for the poor fit of the CS model, as described above. Similar explanations can be obtained under the other models, although they are omitted.

The PDPS model fits these data poorly; however, each of the PDPS(1, 2), PDPS(1, 3) and PDPS(2, 4) models fits these data well. Therefore the pair of cells (1, 2) and (2, 1) (or the pair (1, 3) and (3, 1), or the pair (2, 4) and (4, 2)) may influence the lack of the structure of PDPS.


The S(1, 3) and ME(1, 3) models fit these data poorly; however, the LDPS(1, 3) model fits these data well. Therefore, it is seen from Theorem 1 that, for these data, the poor fit of the S(1, 3) model is caused by the lack of the structure of the ME(1, 3) model.
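A descriptive look at the ME(1, 3) structure is possible directly from the observed counts of Table 2. The sketch below computes the sample conditional means of the row and column scores after dropping cells (1, 3) and (3, 1), and checks that their difference equals the left side of equation (1) with observed proportions in place of $q_{ij}$. It is only a sample analogue, not the fitted model or the test reported in Table 3, and the function names are hypothetical.

```python
# Observed counts from Table 2 (rows: mother's education, columns: father's).
table2 = [[81, 3, 9, 11],
          [14, 8, 9, 6],
          [43, 7, 43, 18],
          [21, 6, 24, 87]]

def conditional_means(n, u, v):
    """Sample means of the row (X) and column (Y) scores, excluding (u, v), (v, u)."""
    r = len(n)
    skip = {(u - 1, v - 1), (v - 1, u - 1)}
    cells = [(i, j) for i in range(r) for j in range(r) if (i, j) not in skip]
    total = sum(n[i][j] for i, j in cells)
    mean_x = sum((i + 1) * n[i][j] for i, j in cells) / total
    mean_y = sum((j + 1) * n[i][j] for i, j in cells) / total
    return mean_x, mean_y

def equation_1_lhs(n, u, v):
    """Sum over i < j, (i, j) != (u, v), of (j - i)(q_ij - q_ji) with observed q."""
    r = len(n)
    total = sum(map(sum, n)) - n[u - 1][v - 1] - n[v - 1][u - 1]
    return sum(
        (j - i) * (n[i][j] - n[j][i])
        for i in range(r) for j in range(i + 1, r)
        if (i + 1, j + 1) != (u, v)
    ) / total

mean_x, mean_y = conditional_means(table2, 1, 3)
lhs = equation_1_lhs(table2, 1, 3)
```

For these data the conditional sample means are 925/338 (about 2.74) for the mother and 880/338 (about 2.60) for the father, so the left side of equation (1) is $-45/338$ rather than zero, which is the direction of departure that the ME(1, 3) test in Table 3 assesses.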

Conclusion 

When the PDPS model fits the data poorly, the incomplete PDPS model (i.e., the PDPS$(u,v)$ model) and Theorem 1 would be useful for finding which pair of cells influences the lack of the structure of PDPS (including the structure of S, CS, LDPS, 2RPS, and DPS).